QUEUE BASED MULTI-LEVEL AQM WITH DROP
PRECEDENCE DIFFERENTIATION 

RELATED U.S. APPLICATION DATA 

[0001] This patent application is a continuation-in-part application of U.S. Patent 
Application No. 09/455,445 (Attorney Docket No. 10407RO) filed Dec. 6, 1999 and U.S. 
Patent Application No. 10/633,459 (Attorney Docket No. 43049-0007) filed August 1, 
2003; the contents of which are hereby incorporated by reference. 

FIELD OF THE INVENTION 

[0002] The present invention relates to network queue management and is
particularly concerned with queue based multi-level active queue management with
drop precedence differentiation.

BACKGROUND OF THE INVENTION 

[0003] New applications and user requirements are driving the need for a network 
architecture that is both flexible and capable of differentiating between the needs of 
different applications and users. Different applications and users are increasingly 
demanding different quality of service (QoS) and network usage rates. The current 
Internet and most private corporate networks, however, offer best-effort service to traffic. 
To address these new demands, the Internet Engineering Task Force (IETF) has been 
looking at a number of architectural enhancements that will enable networks to provide 
service differentiation based on application and user needs. One of these efforts has 
resulted in the differentiated services (DiffServ) architecture. 

[0004] In the DiffServ architecture, an edge device (gateway, router, switch,
server, etc.) would classify/mark packets into several behaviour aggregates according to
their diverse QoS requirements such as bandwidth, delay, and packet drop precedence.
A DiffServ domain refers to a contiguous set of nodes operating with a common set of
service provisioning policies and per-hop behaviour (PHB) definitions. Per-domain
services are realized by traffic conditioning at the edge and simple differentiated
forwarding at the core of the network. Packets are marked with the appropriate DiffServ
codepoint (DSCP) at the edge of the network and, within the core of the network, the
network nodes (routers, switches, etc.) simply forward packets based on the PHB
associated with the DSCP. An end-to-end differentiated service is obtained by
concatenation of per-DiffServ domain services. The underlying goal of the DiffServ
architecture is to address the scalability issue regarding per-flow service differentiation
in the core of the network. In DiffServ, the core network elements do not necessarily
have to implement complex resource reservation, scheduling, processing and
classification mechanisms in addition to maintaining state information for each individual
traffic flow. The architecture allows network designers to push most of the state and
forwarding complexity to the edge of the network.

[0005] The IETF decided to specify particular forwarding behaviours (the PHB)
rather than standardizing services. This is to allow service providers the freedom to
construct services that meet their customers' needs. The IETF has currently
standardized two PHB groups, namely, the Expedited Forwarding (EF) PHB and the
Assured Forwarding (AF) PHB. Packets marked for the EF PHB receive a premium
service with low loss, low latency, low jitter, and guaranteed bandwidth end-to-end. The
AF PHB, on the other hand, specifies four independent classes of forwarding (each of
which can be assigned a certain amount of forwarding resources, i.e., buffer space and
bandwidth) and three levels of drop precedence per class. The three drop precedence
levels are also referred to in terms of color (in the order of high to low priority or
precedence) as green, yellow, and red. During network congestion, an AF-compliant
DiffServ node drops low precedence (red) packets in preference to higher precedence
(green, yellow) packets.

[0006] With the AF PHB, packets entering the network are marked with the goal 
of assigning a low drop probability to the traffic that fits within the subscribed profile and 
a higher drop probability to the excess traffic. During congestion at a node, packets 
marked with higher drop probability are preferentially dropped in order to make buffer
room for packets marked with the lowest drop probability (which may be dropped only in
the case of severe congestion). For these reasons, the packet drop mechanism at the
node plays an important role in the quality of service offered to the user.

[0007] Active queue management (AQM) has been proposed as a means to
provide some congestion control as well as some notion of QoS to users. One important
class of AQM is based on randomized packet dropping or marking. With this form of
AQM, a network node drops each arriving packet with a certain probability, where the
exact probability is a function of the average queue size (or any suitable indicator of
network congestion such as rate mismatch at a node). With AQM, dropping a packet is a
way of signalling congestion to adaptive traffic sources such as those running transport
protocols like the Transmission Control Protocol (TCP). By dropping a packet, the node
sends an implicit indication to the traffic source that congestion is experienced at some
point along the end-to-end path to the destination. The traffic source is expected to
respond to this indication by reducing its transmission rate so that the network node
buffer does not overflow. An important benefit of AQM is that it avoids the global
synchronization of many traffic sources decreasing and increasing their window at the
same time. The random early detection (RED) algorithm is a well known example of an
AQM algorithm.

[0008] The difficulty in configuring or tuning the RED parameters has resulted in 
the need for development of alternatives to RED. In view of the foregoing, it would be 
desirable to provide a technique for network queue management which overcomes the 
above-described inadequacies and shortcomings by providing a mechanism which does 
not react to short-term burst traffic and allows each precedence level to be addressed 
differently, while providing lower parameter configuration complexity and greater ease of 
configuration over a wide range of network conditions. 

SUMMARY OF THE INVENTION 

[0009] An object of the present invention is to provide an improved queue based 
multi-level active queue management with drop precedence differentiation.

[0010] According to an aspect of the present invention, there is provided a
method for controlling a data flow in a data network at an element having a queue,
starting with the step of specifying a plurality of precedence grades, each of the
precedence grades having an associated priority. Next, for each precedence grade,
calculating a cumulative queue size q(n) where q(n) is the sum of the queue size for a
particular precedence grade under consideration plus the queue sizes of all precedence
grades with a higher priority than said particular precedence grade under consideration.
Then, for each precedence grade calculating an error signal e(n), according to the
relation e(n) = (q(n) - T), where T is an assigned precedence grade queue capacity at
time n. Following this, there is computed for each precedence grade a mark/drop
probability p(n) according to the relation:

$p(n) = \min\{\max[\,p(n-1) + \alpha\, e(n)/(2T),\ 0\,],\ \theta\}$

where $\alpha$ is a control gain, and $0 < \theta \le 1$; and subsequently, for each precedence grade
executing a packet mark/drop routine based upon the calculated mark/drop probability
p(n). A convenient number of precedence grades is three.

[0011] Conveniently, the queue size may be filtered by use of an exponentially
weighted moving average scheme according to the relation:

$q'(n) = (1-\beta)\, q'(n-1) + \beta\, q(n)$

where $\beta$ is a filter gain parameter such that $0 < \beta < 1$,

q'(n-1) is the filtered queue size at time n-1,

q'(n) is the desired filtered queue size at time n, and

q(n) is the cumulative queue size at time n.

[0012] According to another aspect of the invention, preceding the packet
mark/drop routine may be a bypassing routine involving the steps of, for each
precedence grade, testing the cumulative queue size q(n) against a queue threshold L
specific to that precedence grade; and if the cumulative queue size q(n) is below or
equal to the queue threshold L then bypassing the step of executing a packet mark/drop
routine for that precedence grade.

[0013] Conveniently, the packet mark/drop routine may be realized according to a
random number generator mark/drop scheme.

[0014] In accordance with another aspect of the present invention, there is
provided an apparatus for controlling a data flow in a data network having a plurality of
precedence grades, where each of the precedence grades has a priority associated with
it. The apparatus has a cumulative queue size calculator for calculating a cumulative
queue size q(n) associated with each of the plurality of precedence grades, wherein q(n)
is the sum of the queue size for a particular precedence grade under consideration plus
the queue sizes of all precedence grades with a higher priority than said particular
precedence grade under consideration. The apparatus further contains an error signal
calculator for calculating an error signal e(n) for each of said plurality of precedence
grades according to the relation:

e(n) = (q(n) - T),

where T is an assigned precedence grade queue capacity at time n. Further, the
apparatus has a mark/drop probability processor for computing a mark/drop probability
p(n) for each of said plurality of precedence grades according to the relation:

$p(n) = \min\{\max[\,p(n-1) + \alpha\, e(n)/(2T),\ 0\,],\ \theta\}$

where $\alpha$ is a control gain, and $0 < \theta \le 1$; and a packet mark/drop module for executing a
packet mark/drop routine based upon the calculated mark/drop probability p(n).

[0015] In accordance with another aspect of the present invention, there is
provided an article of manufacture carrying instructions for a method for queue based
multi-level active queue management with drop precedence differentiation in a data
network and, further, there is provided a signal embodied in a carrier wave representing
instructions for a method for queue based multi-level active queue management with
drop precedence differentiation in a data network according to an integral control
scheme.

[0016] The method and apparatus of the invention serve to integrate congestion 
control features into the differentiated service architecture. Advantages of the present 
invention include the lower parameter configuration complexity and the ease of
configuration it offers over a wide range of network conditions. The invention has the
further goal of maintaining stabilized network queues, thereby minimizing the
occurrences of queue overflows and underflows, and providing high system utilization.

[0017] The present invention will now be described in more detail with reference
to exemplary embodiments thereof as shown in the appended drawings. While the 
present invention is described below with reference to the preferred embodiments, it 
should be understood that the present invention is not limited thereto. Those of ordinary 
skill in the art having access to the teachings herein will recognize additional 
implementations, modifications, and embodiments which are within the scope of the 
present invention as disclosed and claimed herein. 



BRIEF DESCRIPTION OF THE DRAWINGS 



[0018] The invention will be further understood from the following detailed 
description of embodiments of the invention and accompanying drawings in which: 

[0019] FIG. 1 is a block diagram of a mark/drop probability computation routine
according to an embodiment of the invention.

[0020] FIG. 2 is a block diagram of a packet drop/mark routine according to an
embodiment of the invention.

[0021] FIG. 3 is a plot of the relative relationship of three queue sizes relative to a
threshold according to an embodiment of the invention.

[0022] FIG. 4 is a plot of the relative relationship of three drop probabilities
according to an embodiment of the invention.

[0023] FIG. 5 is a block diagram of a mark/drop probability computation routine
according to an alternative embodiment of the invention.

[0024] FIG. 6 is a block diagram of a packet mark/drop routine according to an
alternative embodiment of the invention.

[0025] FIG. 7 is a block diagram of a packet dequeuing routine according to an
alternative embodiment of the invention.



DETAILED DESCRIPTION 



[0026] The description that follows describes a multi-level active queue 
management scheme with drop precedence differentiation which maintains stabilized 
network queues, thereby minimizing the occurrences of queue overflows and 
underflows, and concurrently providing high system utilization.

[0027] According to an embodiment of the invention there is a method that uses a 
simple feedback control approach to randomly discard packets with a load-dependent 
probability when a buffer in a network device gets congested. The method maintains the 
average queue size close to a predetermined threshold, but allows transient traffic 
bursts to be queued without unnecessary packet drops. Following is a brief overview of 
the packet drop probability computations required for the method. 

[0028] In this embodiment, congestion is controlled by randomly dropping packets 
in relation to a probability. This dropping of packets constitutes a signal to applications 
(TCP sources) to reduce their sending rate. The method takes a decision of dropping or 
accepting an incoming packet so that the queue occupancy level is kept at a given 
target level, thereby eliminating, as much as possible, buffer underflow and overflow. 

[0029] The actual queue size in the network device is assumed to be sampled
every $\Delta t$ units of time (seconds), and the algorithm provides a new value of the drop
probability $p_d$ every $\Delta t$ units of time. The parameter $\Delta t$ is the sampling/control interval
of the system. The control system will be described in discrete time.

[0030] Let $q(n)$ denote the actual queue size at discrete time $n$, where
$n = 1\Delta t, 2\Delta t, 3\Delta t, \ldots$, and $T$ the target buffer occupancy. What is required is to determine a
drop probability $p_d$ which will drive the filtered queue size $\hat{q}$ to this target buffer
occupancy. So it is necessary to adapt $p_d$ to react to the filtered queue dynamics
experienced at the node using the following control method: if the filtered queue size $\hat{q}$
is smaller than the target queue size $T$, $p_d$ is decreased to make more aggressive
usage of the available resources, and vice versa if the filtered queue size is high.

[0031] The goal of the controller is therefore to adapt $p_d$ so that the magnitude of
the error signal

$e(n) = \hat{q}(n) - T$

is kept as small as possible.

[0032] The filtered queue size can be obtained, for example, using a simple
exponentially weighted moving average (EWMA) filter,

$\hat{q}(n) = (1-\beta)\,\hat{q}(n-1) + \beta\, q(n), \quad 0 < \beta < 1.$
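
The EWMA relation above can be illustrated with a minimal Python sketch. The function name `ewma_filter`, the sample values, and the choice of beta = 0.1 are assumptions made for illustration only; they are not taken from the specification.

```python
def ewma_filter(q_hat_prev: float, q: float, beta: float) -> float:
    """One EWMA step: q_hat(n) = (1 - beta) * q_hat(n-1) + beta * q(n)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must satisfy 0 < beta < 1")
    return (1.0 - beta) * q_hat_prev + beta * q


# Hypothetical queue samples (in packets) containing a one-sample burst.
samples = [100, 100, 400, 100, 100]
q_hat = 100.0          # assumed initial filtered queue size
history = []
for q in samples:
    q_hat = ewma_filter(q_hat, q, beta=0.1)
    history.append(q_hat)
```

With a small filter gain the burst of 400 packets moves the filtered value only modestly, which is the property the method relies on to let transient bursts be queued.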

[0033] The control mechanism is then obtained as the incremental adaptation of
the drop probability $p_d$ proportional to the error signal

$p_d(n) = p_d(n-1) + \alpha\, e(n),$

[0034] where $\alpha$ is a control gain.

[0035] Note that $p_d(n)$, as a probability, is always bounded by $0 \le p_d(n) \le 1$, for
all $n$.

[0036] The basic recursion

$p_d(n) = p_d(n-1) + \alpha\, e(n)$

implements the standard summation or integral control scheme since

$\Delta p_d(n) = p_d(n) - p_d(n-1) = \alpha\, e(n)$ or

$p_d(n) = \alpha \sum_{i=0}^{n} e(i),$

in discrete-time (and $dp_d(t)/dt = \alpha\, e(t)$ or $p_d(t) = \alpha \int_0^t e(\tau)\, d\tau$, in continuous-time).

[0037] In the method implemented in this embodiment, the normalized error
signal $e(n)/(2T)$ is used instead, resulting in the control equation

[0038] $p_d(n) = p_d(n-1) + \alpha\, \frac{e(n)}{2T},$

[0039] where the term $2T$ serves only as a normalization parameter.
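
As a worked illustration of the normalized control equation, the following Python sketch performs one update and clamps the result to the interval from 0 to an upper bound theta, as in the relation given in the summary. The function name `update_drop_probability` and all numeric values are assumptions chosen for the example.

```python
def update_drop_probability(p_prev: float, q_hat: float, target: float,
                            alpha: float, theta: float = 1.0) -> float:
    """p_d(n) = min(max(p_d(n-1) + alpha * e(n) / (2T), 0), theta)."""
    e = q_hat - target                       # error signal e(n) = q_hat(n) - T
    p = p_prev + alpha * e / (2.0 * target)  # integral step on normalized error
    return min(max(p, 0.0), theta)           # keep p_d a valid probability


# Queue above target: probability rises by alpha * 50 / 200 = 0.0025.
p_up = update_drop_probability(0.1, q_hat=150.0, target=100.0, alpha=0.01)

# Queue below target with p_d already 0: clamped at 0.
p_floor = update_drop_probability(0.0, q_hat=50.0, target=100.0, alpha=0.01)

# Very large error: clamped at the upper bound theta.
p_cap = update_drop_probability(0.999, q_hat=1000.0, target=100.0, alpha=1.0)
```

Because the error is divided by 2T, the same gain can be reused across queues with different targets, which is one way to read the remark that 2T serves only as a normalization parameter.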

[0040] Filtering of the queue size $q$ has the important benefit of allowing traffic
bursts to be queued without being unnecessarily discarded. This is because congestion
is detected by comparing the average queue size to a pre-determined threshold. In
effect, only the average queue size is controlled, allowing transient traffic bursts to be
accommodated in the queue.

[0041] FIG. 1 represents a flowchart of the drop probability computations. The
parameter $\theta \le 1$, initialized at step 101 in FIG. 1, is an optional upper bound on the drop
probability $p_d$. As shown in FIG. 1 the process initiates, at step 101, at discrete time
$n = 0$, by initializing certain parameters. A timer is set to $\Delta t$ time units, a control gain $\alpha$ is
established, and mark/drop probability $p_d(n)$ and initial queue size $q(n)$ are set to initial
values. The initial mark/drop probability is used in the mark/drop routine until further
samples are available. At step 103, the timer is reset to $\Delta t$ time units to advance to the
next discrete time interval. Then at step 105, the current queue size $q(n)$ is measured.

[0042] At step 107, there is an optional step of pre-filtering the queue size as 
described previously. 

[0043] At step 109, the assigned queue capacity is determined. Typically, this is 
a given for a particular network configuration but may vary as circumstances warrant, for 
example, if the network is modified. 

[0044] At step 111, an error signal $e(n)$ is calculated as the difference between
the measured (and possibly filtered) queue size and the assigned capacity.

[0045] At step 113, a current mark/drop probability $p_d(n)$ is calculated using the
gain $\alpha$ established at step 101.

[0046] The mark/drop probability calculated at step 113 may be used as the
mark/drop probability until the next measurement time as tracked by the timer, at which
point a new mark/drop probability will be calculated. In addition, the filtered queue size
$\hat{q}(n)$, if filtering is used, is stored to be used at the next measurement time.

[0047] The process may then loop back to step 103 upon timer expiration for 
another iteration of the process. 
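
The loop of FIG. 1 (steps 101 through 113) might be sketched as a small stateful object that is invoked on each timer expiry. This is a hedged sketch only: the class name `DropProbabilityController`, the method name `on_timer`, the use of `beta=None` to skip the optional filtering step, and the numeric values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DropProbabilityController:
    target: float                 # T, assigned queue capacity (step 109)
    alpha: float                  # control gain (step 101)
    theta: float = 1.0            # optional upper bound on p_d (step 101)
    beta: Optional[float] = None  # EWMA gain; None skips filtering (step 107)
    p_d: float = 0.0              # initial mark/drop probability
    q_hat: float = 0.0            # initial (filtered) queue size

    def on_timer(self, q_measured: float) -> float:
        """Run steps 105 to 113 once; call at each timer expiry (step 103)."""
        if self.beta is None:
            self.q_hat = q_measured
        else:
            self.q_hat = (1.0 - self.beta) * self.q_hat + self.beta * q_measured
        e = self.q_hat - self.target                        # step 111
        p = self.p_d + self.alpha * e / (2.0 * self.target)  # step 113
        self.p_d = min(max(p, 0.0), self.theta)
        return self.p_d


ctrl = DropProbabilityController(target=100.0, alpha=0.1)
# Three congested samples followed by an empty queue.
trace = [ctrl.on_timer(q) for q in (200.0, 200.0, 200.0, 0.0)]
```

The stored `p_d` is the value the packet mark/drop routine would read between timer expiries; it rises while the queue stays above the target and falls again once the queue drains.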

[0048] Next is described a method for dropping packets at a queue. As shown in
the flowchart in FIG. 2, the decision to accept or drop an incoming packet is based on
the outcome of a comparison of a randomly generated number $p_r \in [0,1]$ and the drop
probability $p_d$. The procedure can be summarized as follows:

if $q(n) \le L$, then accept incoming packet
else, if $p_r \le p_d$, then drop packet,
else, accept packet

[0049] Conveniently, the parameter $L$ ($L < T$) may be introduced in the control
process to help maintain high link utilization and keep the queue size around the target
level. The drop controller does not drop packets when $q(n) \le L$ in order to maintain high
resource utilization and also not to further penalize sources which are in the process of
backing off in response to (previous) packet drops. Note that there is always a time lag
between the time a packet is dropped and the time a source responds to the packet
drop. The computation of $p_d$, however, still continues even if packet dropping is
suspended (when $q(n) \le L$). The parameter $L$ is typically configured to be a little smaller
than $T$, e.g., $L$ can be in the range $L \in [0.8T, 0.9T]$. A recommended value is $L = 0.9T$.

[0050] Referring to FIG. 2, upon a packet arrival at the queue, at step 201, a
determination is made whether the queue size $q(n)$ is less than or equal to a
corresponding no-mark/drop queue threshold $L$. If the size is less than or equal to the
threshold, then the incoming packet is queued at step 211.

[0051] If the size is not less than or equal to the threshold, then the routine moves to
step 205 where a random number $p_r \in [0,1]$ is generated.

[0052] At step 207 a determination of whether the random number $p_r$ is less
than or equal to the calculated mark/drop probability $p_d(n)$ is made.

[0053] If the random number $p_r$ is less than or equal to $p_d(n)$, then the packet is
marked/dropped at step 209. If not, the packet is queued at step 211.

[0054] The process ceases at step 213 until triggered again by the arrival of 
another packet. 
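
The per-packet decision of FIG. 2 can be sketched in a few lines of Python. The sketch follows the "less than or equal to" wording of the walk-through above; the function name `should_drop`, the injectable `rng` argument (added so the decision can be exercised deterministically), and the numeric values are assumptions for illustration.

```python
import random
from typing import Callable


def should_drop(q: float, no_drop_threshold: float, p_d: float,
                rng: Callable[[], float] = random.random) -> bool:
    """Return True if the arriving packet should be marked/dropped."""
    if q <= no_drop_threshold:
        return False          # below L: always queue the packet
    p_r = rng()               # random number in [0, 1)
    return p_r <= p_d         # drop when the draw does not exceed p_d


# Below the no-drop threshold the packet is queued even if p_d is 1.
below = should_drop(q=50, no_drop_threshold=90, p_d=1.0)

# Above the threshold the outcome depends on the random draw (fixed at 0.3).
dropped = should_drop(q=95, no_drop_threshold=90, p_d=0.5, rng=lambda: 0.3)
kept = should_drop(q=95, no_drop_threshold=90, p_d=0.2, rng=lambda: 0.3)
```

Note that the threshold test only suspends dropping; the drop probability itself continues to be updated by the timer-driven routine.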

[0055] Following is a description of an enhanced algorithm which retains all the
features of the basic algorithm in addition to the ability to drop low-precedence packets
in preference to higher precedence packets.

[0056] An enhanced embodiment consists of multiple instances of the basic 
method. Each instance is associated with a precedence grade. A precedence grade is 
a traffic type having an associated priority. An example precedence grading scheme is 
that of the green-yellow-red color schema. Each instance (one for each priority or color) 
will be controlling traffic in a single queue but with the instances having different 
thresholds. As an example of the enhanced algorithm applied to a case of three 
precedence grades the following parameters are defined: 

(Note: although the discussion here is centered around three drop precedences, 
higher levels of drop precedence (other than three) can be used.) 

[0057] Three queue size counters (used as congestion indicators), one for each
color, $c \in \{g = \text{green},\ y = \text{yellow},\ r = \text{red}\}$, with green packets having higher precedence
than red or yellow packets, and yellow packets having higher precedence than red
packets:

$q_g$ = sum of only green packets in the aggregate queue;
$q_y$ = sum of yellow and green packets in the aggregate queue; and
$q_r$ = sum of all packets (of all colors) in the aggregate queue.

[0058] Long term averages of these queue sizes are maintained as $\hat{q}_g$, $\hat{q}_y$, and
$\hat{q}_r$, respectively. Especially note that the sum for a particular color or precedence grade
is not the specific sum of the packets having that color, but for the purposes here is the
combined sum of the total of the packets of that color summed with the total of the
packets of higher precedence grade. Thus $q_g$, being the queue size of the highest
precedence grade, is solely the queue size of the green packets whereas $q_y$ is the sum
of the total of the yellow packets plus the total of the green packets, the green packets
being of higher precedence grade than the yellow. The same relation holds for the case
of other than three precedence grades.
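
The cumulative counters described above can be derived from per-color packet counts with a running sum taken from highest to lowest precedence. A minimal Python sketch follows; the names `PRECEDENCE` and `cumulative_queue_sizes` and the packet counts are assumptions for illustration.

```python
from typing import Dict

# Precedence grades ordered from highest to lowest precedence.
PRECEDENCE = ("green", "yellow", "red")


def cumulative_queue_sizes(per_color: Dict[str, int]) -> Dict[str, int]:
    """Map per-color packet counts to cumulative sizes q_g, q_y, q_r."""
    totals = {}
    running = 0
    for color in PRECEDENCE:
        running += per_color.get(color, 0)  # add this grade to all higher ones
        totals[color] = running
    return totals


# Hypothetical occupancy: 10 green, 5 yellow and 20 red packets.
q = cumulative_queue_sizes({"green": 10, "yellow": 5, "red": 20})
```

The same running sum extends to any number of precedence grades by lengthening the ordered tuple.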

[0059] One queue threshold for all colors, $T = T_c$, $c \in \{g, y, r\}$, as illustrated in
FIG. 3.

[0060] Three no-drop thresholds, one for each color, $L_c$, $c \in \{g, y, r\}$.
Alternatively, one no-drop threshold could be used for all colors, $L = L_c$, $c \in \{g, y, r\}$.

[0061] Three drop probability computations, one for each color, $p_{d,c}$, $c \in \{g, y, r\}$.

[0062] Under sustained congestion (where traffic of all colors is sent to the
queue), the drop probabilities $p_{d,g}$ 405, $p_{d,y}$ 403 and $p_{d,r}$ 401 will increase relative to
each other as shown in FIG. 4. At any given instant during the congestion, the drop
probabilities will be $p_{d,g} < p_{d,y} < p_{d,r}$.

[0063] The drop probabilities are computed as shown in FIG. 5. As shown in
FIG. 5 the process initiates, at step 501, at discrete time $n = 0$, by initializing certain
parameters. A timer is set to $\Delta t$ time units, a control gain $\alpha$ is established, and
mark/drop probability $p_{d,c}(n)$ and initial queue size $q_c(n)$ are set to initial values for each
precedence grade. The initial mark/drop probability is used in the mark/drop routine until
further samples are available. At step 503, the timer is reset to $\Delta t$ time units to advance
to the next discrete time interval. Then at step 505, the current queue size $q_c(n)$ is
measured for each precedence grade.

[0064] At step 507, there is an optional step of pre-filtering the queue as 
described previously. 

[0065] At step 509, the assigned capacity for each precedence grade is 
determined. Typically, this is a given for a particular network configuration, possibly in 
an initialization step, but may vary as circumstances warrant, for example, if the network 
is modified. 

[0066] At step 511, an error signal $e(n)$ is calculated as the difference between
the measured (and possibly filtered) cumulative queue size and the assigned queue
capacity for each precedence grade.

[0067] At step 513, a current mark/drop probability $p_{d,c}(n)$ is calculated using the
gain $\alpha$ established at step 501, with a separate probability calculated for each grade.

[0068] The mark/drop probability calculated at step 513 may be used as the 
mark/drop probability until the next measurement time as tracked by the timer, at which 
point a new mark/drop probability will be calculated. In addition, the filtered queue size
$\hat{q}_c(n)$, if filtering is used, is stored to be used at the next measurement time.

[0069] The process may then loop back to step 503 upon timer expiration for 
another iteration of the process. 
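
The per-grade computation of FIG. 5 can be sketched by applying the same normalized integral update to each cumulative queue size against a common target. The following Python sketch omits the optional filtering step for brevity; the names `PRECEDENCE` and `update_drop_probabilities` and the numeric values are assumptions for illustration.

```python
from typing import Dict

PRECEDENCE = ("green", "yellow", "red")  # highest to lowest precedence


def update_drop_probabilities(p_prev: Dict[str, float],
                              q_cum: Dict[str, float],
                              target: float, alpha: float,
                              theta: float = 1.0) -> Dict[str, float]:
    """One timer-driven update of p_{d,c} for every precedence grade c."""
    p_new = {}
    for color in PRECEDENCE:
        e = q_cum[color] - target                       # error for this grade
        p = p_prev[color] + alpha * e / (2.0 * target)  # normalized integral step
        p_new[color] = min(max(p, 0.0), theta)
    return p_new


# Hypothetical sustained congestion: cumulative sizes held constant.
q_cum = {"green": 60.0, "yellow": 110.0, "red": 160.0}
p = {c: 0.0 for c in PRECEDENCE}
for _ in range(5):
    p = update_drop_probabilities(p, q_cum, target=100.0, alpha=0.1)
```

Because the cumulative size for red always includes yellow and green, its error is the largest and its probability grows fastest, which reproduces the ordering of drop probabilities described for FIG. 4.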

[0070] A packet drop routine for packets arriving at a queue running an
embodiment of the invention is shown in FIG. 6. In general, given that under sustained
traffic (with all colors), there is $q_g < q_y < q_r$, and given that $T_g = T_y = T_r$, red packets will
get dropped the most and green packets will get dropped the least. Green packets are
dropped only under severe congestion.

[0071] Referring to FIG. 6, the routine commences with a packet arrival at step
601. Steps 603, 605, and 607 effect a determination as to the color (precedence grade)
associated with the packet. Steps 613, 615, and 617 each compare the respective
queue size to a limit parameter to examine if the appropriate queue is less than the limit.
If the queue is less than the limit, control moves to steps 634, 636, and 638 respectively
for the packet to be queued, and in order to increment the appropriate queue size. Note
how queuing a packet will impact the queue size of the queue of its color and all colors
of lesser precedence.

[0072] Should the comparison in steps 613, 615, and 617 have determined that
the respective queue size is in excess of the limit, a random number is generated at
steps 623, 625, and 627 respectively. This random number is compared with the
previously determined mark/drop probability $p_{d,c}(n)$ at steps 633, 635 and 637
respectively. Should the random number $p_r$ be less than the previously determined
mark/drop probability $p_{d,c}(n)$, then the packet is dropped at one of steps 643, 645, and
647 respectively. Should the comparison prove otherwise, then the packet is queued,
and the appropriate queue size is incremented at steps 634, 636, and 638 as described
previously. Upon the completion of either dropping the packet, or queuing the packet
and incrementing the queue sizes, the routine ceases at step 650 until a new packet
arrives.
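
The arrival routine of FIG. 6 can be sketched as follows, using the strict "less than" comparisons stated in the two preceding paragraphs. The function name `on_packet_arrival`, the dictionary layout of the counters, the injectable `rng`, and the numeric values are assumptions for illustration.

```python
import random
from typing import Callable, Dict

PRECEDENCE = ("green", "yellow", "red")  # highest to lowest precedence


def on_packet_arrival(color: str, q: Dict[str, int], limit: Dict[str, int],
                      p_d: Dict[str, float],
                      rng: Callable[[], float] = random.random) -> bool:
    """Return True if the packet is queued, False if dropped; q is updated in place."""
    if q[color] >= limit[color] and rng() < p_d[color]:
        return False  # dropped (steps 643, 645 or 647)
    # Queued: increment the counter of this color and of every lesser precedence.
    for c in PRECEDENCE[PRECEDENCE.index(color):]:
        q[c] += 1
    return True


limit = {"green": 90, "yellow": 90, "red": 90}
p_d = {"green": 0.0, "yellow": 0.1, "red": 0.5}
q = {"green": 10, "yellow": 50, "red": 95}

# Red counter exceeds its limit and the (fixed) draw 0.1 is below 0.5: dropped.
red_queued = on_packet_arrival("red", q, limit, p_d, rng=lambda: 0.1)
# Green counter is below its limit: queued, and all three counters grow.
green_queued = on_packet_arrival("green", q, limit, p_d)
```

Incrementing every counter at or below the packet's precedence is what keeps the counters cumulative without recomputing sums on each arrival.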

[0073] FIG. 7 shows how the queue size counters are updated when a packet of
a particular color leaves the queue. Referring to FIG. 7, the routine commences with a
packet departure at step 701. Steps 703, 705, and 707 effect a determination as to the
color (precedence grade) associated with the packet. Appropriate to the color
(precedence grade), steps 713, 715, and 717 dequeue the packet, and decrement the
appropriate queue size for the color associated with the packet. Note how dequeuing
a packet will impact the queue size of the queue of its color and all colors of lesser
precedence. Upon completion of dequeuing and queue size decrementing, the routine
ends at step 720 until the departure of another packet.
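
The departure routine of FIG. 7 mirrors the arrival case: the counter of the departing packet's color and of every lesser precedence is decremented. A minimal Python sketch follows; the function name `on_packet_departure` and the counter values are assumptions for illustration.

```python
from typing import Dict

PRECEDENCE = ("green", "yellow", "red")  # highest to lowest precedence


def on_packet_departure(color: str, q: Dict[str, int]) -> None:
    """Decrement the cumulative counters affected by a departing packet."""
    for c in PRECEDENCE[PRECEDENCE.index(color):]:
        if q[c] <= 0:
            raise ValueError(f"counter for {c} would become negative")
        q[c] -= 1


q = {"green": 10, "yellow": 15, "red": 35}
on_packet_departure("yellow", q)  # yellow and red counters drop by one
on_packet_departure("green", q)   # all three counters drop by one
```

The green counter is untouched by a yellow departure because it counts only green packets, whereas a green departure lowers all three cumulative counters.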

[0100] While the invention has been described in conjunction with specific
embodiments thereof, it is evident that many alternatives, modifications, and variations
will be apparent to those skilled in the art in light of the foregoing description.
Accordingly, it is intended to embrace all such alternatives, modifications, and variations
as fall within the spirit and broad scope of the appended claims.



